ABSTRACT

General multilevel nonlinear optimization problems arise in design of complex systems and can 
be used as a means of regularization for multicriteria optimization problems. Here for clarity in 
displaying our ideas, we restrict ourselves to general bilevel optimization problems, and we present 
two solution approaches. Both approaches use a trust-region globalization strategy, and they can be 
easily extended to handle the general multilevel problem. We make no convexity assumptions, but 
we do assume that the problem has a nondegenerate feasible set. We consider necessary optimality 
conditions for the bilevel problem formulations and discuss results that can be extended to obtain 
multilevel optimization formulations with constraints at each level. 
1 Introduction


We are interested in nonlinear multilevel optimization (MLO) problems, in general, and bilevel 
optimization (BLO) problems, in particular, for two related and important reasons. First, gen- 
eral multilevel optimization problems arise in the course of decomposition of multidisciplinary 
design optimization problems (see, for example, Sobieszczanski-Sobieski, James, and Dovi [15],
Sobieszczanski-Sobieski, James, and Riley [16], Sobieszczanski-Sobieski [14], Barthelemy [4], Padula 
and Young [10]). 

The other, related, application is the field of multicriteria (or multiobjective, or vector) op- 
timization. Design of any feature of a complex system involves achieving a compromise among 
several, possibly competing, objectives. For example, aeronautical design objectives include such 
criteria as minimizing weight for a given performance, maximizing lift, finding the shape with least 
drag, achieving the least time trajectory between two points, and other objectives. 

There are several current approaches to solving multicriteria optimization problems. One ap- 
proach is to introduce a single criterion that somehow incorporates the many criteria of the problem 
(see, e.g., Wood [19]). Another technique uses the notion of Pareto optimality to achieve a balance 
between the objectives (see, for example, Sawaragi, Nakayama, and Tanino [12]). The approach 
of goal programming selects one objective to serve as an optimization objective and turns the 
other objectives into constraints by setting bounds or “goals” for them. Finally, a subset of multi- 
level problems, known as lexicographic optimization problems, involves the notion of lexicographic 
comparison; see, for instance, Ben-Israel, Ben-Tal, and Zlobec [5]. 

We consider yet another approach, namely, to restate the multiobjective problem as a multi- 
level optimization problem. This approach has not been extensively used because, to the authors’ 
knowledge, efficient algorithms for general nonlinear multilevel optimization have not yet been dis- 
covered. In addition, due to the theoretical complexity of the problem, a theoretical basis for the 
general problem has not been developed as yet. The general problem of MLO follows. 

Let $f_m, \ldots, f_1$ be the problem objectives, arranged in the order of increasing significance, i.e.,
$f_1$ is the most important objective, while $f_m$ is the least important objective. Note that this
significance does not need to be quantified in any way other than the establishment of the order. 
Then the formulation is: 

Problem MLO: 

minimize $f_m(x_m)$

subject to $x_m \in \operatorname{argmin} f_{m-1}(x_{m-1})$

$\vdots$

subject to $x_2 \in \operatorname{argmin} f_1(x_1)$,

where “argmin” denotes the set of minima of a particular $f_i$, and $f_1, \ldots, f_m : \mathbb{R}^n \to \mathbb{R}$ are sufficiently
smooth. The formulation can be easily extended to include constraints, but in the scope of this 
discussion we shall address unconstrained objectives. 

Thus the MLO formulation gives us a way to “regularize” the ill-defined problem of multicriteria 
optimization. Here the engineering insight will enter into defining the order in which the optimiza- 
tion levels are stated. The reader will see that if the objective values of all the objectives except 
the most important one were known at a solution to an MLO problem, then one would have a goal 
program. Thus, it is possible to view MLO as related to goal programming except that the order
of importance of the objectives needs to be specified rather than goals for each objective value. 

Existing work in multilevel and, in particular, bilevel optimization (see Vicente and Calamai 
[18] for a review) deals only with functions under extremely strong assumptions of convexity and 
has many theoretical difficulties. We are approaching this problem with cautious optimism because 
in recent work (see Alexandrov [1], Alexandrov and Dennis [2], [3]), algorithms for multilevel 
optimization of problems with special structure have been shown to exhibit global convergence 
under reasonable assumptions. 

In this paper we will use the general bilevel optimization problem to discuss issues in MLO. 
First, we remark in passing that for the two objectives, there are two different problems determined 
by the assignment of the order to the criteria: 

Problem 1: minimize $f_1(x)$

subject to $x \in \operatorname{argmin} \{f_2(y)\}$,

and

Problem 2: minimize $f_2(x)$

subject to $x \in \operatorname{argmin} \{f_1(y)\}$.

The two problems will almost certainly have different answers. In fact, there are simple exam- 
ples of one problem being well-posed, while the other is ill-posed. We contend that engineering 
judgement and insight into the problem are likely to produce a correct or optimal order. In the
contrary case, establishing the right order is likely to lead to engineering insight. 
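
The dependence on the order can be seen on a small example. The sketch below is only an illustration: the two quadratic objectives, the grid, and the helper `solve_bilevel` are assumptions introduced here, and brute-force enumeration stands in for a real algorithm.

```python
# Illustrative bilevel example on a coarse grid (all names are hypothetical).
from itertools import product


def f1(x, y):
    # Minimized along the whole line x = 0, so its minima form a continuum.
    return x ** 2


def f2(x, y):
    # Strictly convex, with the unique minimizer (1, 2).
    return (x - 1) ** 2 + (y - 2) ** 2


def solve_bilevel(outer, inner, grid, tol=1e-12):
    """Minimize `outer` over the grid points that minimize `inner` (brute force)."""
    points = list(product(grid, grid))
    inner_min = min(inner(x, y) for x, y in points)
    feasible = [p for p in points if inner(*p) <= inner_min + tol]
    return min(feasible, key=lambda p: outer(*p))


grid = [i / 4 for i in range(-12, 13)]  # -3.0, -2.75, ..., 3.0

# Problem 1: minimize f1 over argmin f2 (a single feasible point).
sol_problem_1 = solve_bilevel(f1, f2, grid)
# Problem 2: minimize f2 over argmin f1 (the line x = 0).
sol_problem_2 = solve_bilevel(f2, f1, grid)
```

Here Problem 1 is degenerate, since the inner objective has a single minimizer, while Problem 2 leaves a whole line of inner minima over which the outer objective can still be reduced.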

We are proposing two formal algorithms for the bilevel and multilevel optimization problems. 
One algorithm is an extension of the multilevel algorithms in Alexandrov [1]; it moves from
the current approximation of the solution to the next approximation by computing a sequence of
solutions to the minimization subproblems restricted to subspaces of progressively smaller dimension.
The second algorithm arrives at the next estimate of the solution by solving a sequence of local 
optimization subproblems each of which will serve to set a local “goal” in defining the region of 
sufficient decrease in the merit function for the final local optimization subproblem. We have a 
new, promising, merit function that will allow us to evaluate the progress of the algorithm toward 
a solution. 

2 Nonlinear Programming Preliminaries 

In this section we define a number of concepts from unconstrained optimization that enter both
into practical conditions imposed on the steps in an optimization algorithm and into algorithm con- 
vergence analysis. We also briefly describe the multilevel algorithms for nonlinear optimization on 
which the algorithms proposed here are based. Consider the following unconstrained minimization 
problem. 

Problem UNC: 

minimize f(x) 

subject to $x \in \mathbb{R}^n$,

where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable.

Newton’s method and its variations form a standard class of local solution methods for UNC, 
and they can be stated as follows: 

1. Initialize;

2. Do until convergence:

Build a local model:

$$\phi_c(s) = f(x_c) + \nabla f(x_c)^T s + \frac{1}{2} s^T H_c s;$$

Minimize $\phi_c(s)$ to obtain $s_c$;

Set $x_+ = x_c + s_c$;

3. End.

Here $x_c$ and $x_+$ denote the current and the next approximation to a solution, respectively, and $H_c$
is an approximation to the Hessian of $f$ at $x_c$, but not necessarily the true Hessian.
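
The local iteration can be sketched as follows. This is a minimal illustration under stated assumptions: the two-variable test objective, the use of the exact Hessian for $H_c$, the stopping tolerance, and the helper names are all choices made here, not part of the method as described.

```python
import math

# Assumed test objective: f(x) = exp(x0) - x0 + (x1 - 1)^2, minimized at (0, 1).


def grad(x):
    return (math.exp(x[0]) - 1.0, 2.0 * (x[1] - 1.0))


def hess(x):
    # Exact Hessian, used here as H_c (an assumption; any approximation could be used).
    return ((math.exp(x[0]), 0.0), (0.0, 2.0))


def solve_2x2(H, g):
    """Solve H s = -g for a 2x2 matrix H by Cramer's rule."""
    (a, b), (c, d) = H
    det = a * d - b * c
    if abs(det) < 1e-14:
        raise ValueError("model Hessian is singular")
    return ((-g[0] * d + b * g[1]) / det, (-a * g[1] + c * g[0]) / det)


def newton(x, tol=1e-10, max_iter=50):
    for _ in range(max_iter):
        g = grad(x)
        if math.hypot(g[0], g[1]) <= tol:
            break
        # The minimizer of the quadratic model satisfies H_c s = -grad f(x_c).
        s = solve_2x2(hess(x), g)
        x = (x[0] + s[0], x[1] + s[1])
    return x


x_star = newton((1.0, 3.0))
```

No safeguard is applied to the step, which is why the iteration is only locally reliable and motivates the globalization discussed next.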

Trust region algorithms form one of the major approaches designed to improve the global be- 
havior of such local model based algorithms. At each iteration, a typical trust-region algorithm for 
solving problem UNC finds a trial step by solving the following trust-region subproblem approxi- 
mately: 

minimize $f(x_c) + \nabla f(x_c)^T s + \frac{1}{2} s^T H_c s$
subject to $\|s\| \le \delta_c$,

where $\delta_c > 0$ is the trust-region radius, and $\|\cdot\|$ denotes the $\ell_2$ norm. The idea is to model the
objective function in a restricted region and to accept the trial step when the quadratic model 
adequately predicts the behavior of the function, and to recompute the step in a smaller region if 
it does not. 

Detailed treatment of the trust-region approach to unconstrained optimization and nonlinear 
equations can be found in Dennis and Schnabel [6], Sorensen [17], Moré [8], Moré and Sorensen [9],
Powell [7], and Shultz, Schnabel and Byrd [13].

Trust-region algorithms have been successfully extended to solve the general nonlinear con- 
strained optimization problem. In particular, the local step in the successive quadratic program- 
ming (SQP) method is found by computing a minimizer of the quadratic model of the Lagrangian 
at the current point, subject to linearized constraints. A trust-region algorithm based on SQP 
adds the trust-region constraint to the subproblem and additional constraints designed to ensure 
that the trust-region constraint and the linearized constraints are consistent. We shall see that the 
algorithms proposed here may be viewed as a generalization of the SQP approach to bilevel and 
multilevel optimization. 

2.1 Merit Functions 

In order to evaluate a trial step, trust-region algorithms use merit functions, which are functions 
related to the problem in such a way that the improvement in the merit function signifies progress 
toward the solution of the problem. 

For unconstrained minimization, a natural choice for a merit function is the objective function 
itself. Let 

$$\phi_c(s) = f(x_c) + \nabla f(x_c)^T s + \frac{1}{2} s^T H_c s$$

denote the quadratic model of the merit function. We define two related functions.

The actual reduction is defined as

$$\mathrm{ared}_c(s_c) = f(x_c) - f(x_c + s_c),$$

and the predicted reduction is defined as

$$\mathrm{pred}_c(s_c) = \phi_c(0) - \phi_c(s_c) = -\nabla f(x_c)^T s_c - \frac{1}{2} s_c^T H_c s_c,$$

so that the predicted reduction in the merit function is an approximation to the actual reduction 
in the merit function. 

The standard way to evaluate the trial step in trust-region methods is to consider the ratio of
the actual reduction to the predicted reduction. A ratio below a small predetermined threshold
causes the step to be rejected. Otherwise the step is accepted.
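
The acceptance test can be sketched in a few lines. The rejection threshold `eta`, the expansion threshold 0.75, and the halving and doubling of the radius are illustrative assumptions; the text only requires that a ratio below a small threshold causes rejection.

```python
def predicted_reduction(g, H, s):
    """pred = phi(0) - phi(s) = -g^T s - 0.5 * s^T H s for the quadratic model."""
    n = len(s)
    gs = sum(g[i] * s[i] for i in range(n))
    sHs = sum(s[i] * H[i][j] * s[j] for i in range(n) for j in range(n))
    return -gs - 0.5 * sHs


def evaluate_step(f, x, s, g, H, delta, eta=1e-4):
    """Return (next point, next radius, accepted flag) for the trial step s."""
    x_trial = [xi + si for xi, si in zip(x, s)]
    ared = f(x) - f(x_trial)
    pred = predicted_reduction(g, H, s)  # assumed positive for a sensible step
    ratio = ared / pred
    if ratio < eta:
        # Poor agreement between model and function: reject and shrink the region.
        return x, 0.5 * delta, False
    if ratio > 0.75:
        # Very good agreement: allow a larger region next time (assumed rule).
        delta = 2.0 * delta
    return x_trial, delta, True


# Usage on f(x) = x^4 at x = 1 with a deliberately crude linear model (H = 0).
f = lambda x: x[0] ** 4
good = evaluate_step(f, [1.0], [-0.5], [4.0], [[0.0]], 1.0)
bad = evaluate_step(f, [1.0], [-2.5], [4.0], [[0.0]], 4.0)
```

The short step is accepted because the model predicts the decrease reasonably well; the long step overshoots the minimizer, the function increases, and the step is rejected with a halved radius.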

For nonlinear systems of equations, the norm of the residuals serves as a merit function. For the 
constrained optimization, the merit function is some expression that involves both the objective 
function and the constraints. 


2.2 Fraction of Optimal Decrease and Fraction of Cauchy Decrease 


To assure global convergence of a trust-region algorithm for problem UNC, the trial step is required
to satisfy a fraction of Cauchy decrease (FCD) condition. This mild condition means that the
trial step, $s_c$, must predict at least a fraction of the decrease predicted by the Cauchy step, which
is the steepest descent step for the model within the trust region. We must have for some fixed
$\kappa_1 > 0$

$$\mathrm{pred}_c(s_c) = \phi_c(0) - \phi_c(s_c) \ge \kappa_1 [\phi_c(0) - \phi_c(s_c^{CP})],$$

where

$$s_c^{CP} = -\alpha^{CP} \nabla f(x_c)$$

with

$$\alpha^{CP} = \begin{cases} \dfrac{\|\nabla f(x_c)\|^2}{\nabla f(x_c)^T H_c \nabla f(x_c)} & \text{if } \dfrac{\|\nabla f(x_c)\|^3}{\nabla f(x_c)^T H_c \nabla f(x_c)} \le \delta_c, \\[2ex] \dfrac{\delta_c}{\|\nabla f(x_c)\|} & \text{otherwise.} \end{cases}$$

See Dennis and Schnabel [6], pp. 139-141, for details on the Cauchy point.
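
A minimal sketch of the Cauchy step computation follows. The function name and the list-based vector and matrix representation are assumptions. The branch for nonpositive curvature, which steps to the trust-region boundary, is a common safeguard added here and is not part of the formula above, which presumes positive curvature along the gradient.

```python
import math


def cauchy_step(g, H, delta):
    """Steepest-descent minimizer of the quadratic model within radius delta."""
    n = len(g)
    g_norm = math.sqrt(sum(gi * gi for gi in g))
    if g_norm == 0.0:
        return [0.0] * n  # already stationary for the model
    gHg = sum(g[i] * H[i][j] * g[j] for i in range(n) for j in range(n))
    alpha_boundary = delta / g_norm
    if gHg > 0.0:
        # Equivalent to the two-case formula: take the unconstrained minimizer
        # along -g unless it lies outside the trust region.
        alpha = min(g_norm ** 2 / gHg, alpha_boundary)
    else:
        # Assumed safeguard: with nonpositive curvature the model decreases
        # all the way to the boundary.
        alpha = alpha_boundary
    return [-alpha * gi for gi in g]


identity = [[1.0, 0.0], [0.0, 1.0]]
interior = cauchy_step([3.0, 4.0], identity, 10.0)   # unconstrained minimizer fits
clipped = cauchy_step([3.0, 4.0], identity, 2.5)     # step stops at the boundary
```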

A stronger condition, the fraction of optimal decrease property (FOD), allows one to prove
stronger convergence results. A step $s_c$ is said to satisfy FOD if it predicts at least a fraction of
the decrease predicted by the optimal solution of the trust-region subproblem, i.e., for some fixed
$\kappa_2 > 0$ we have

$$\mathrm{pred}_c(s_c) = \phi_c(0) - \phi_c(s_c) \ge \kappa_2 [\phi_c(0) - \phi_c(s_c^{OPT})],$$

where $s_c^{OPT}$ solves the trust-region subproblem exactly.

The FCD condition is satisfied by all variants of the dogleg method and by restricted subspace 
methods, for example. The stronger FOD condition is satisfied by most algorithms that attempt 
to accurately minimize the local model on the trust region, for instance, by Levenberg-Marquardt 
type methods. 
2.3 Convergence Results

Powell’s global convergence theorem (see Powell [7]) for any unconstrained minimization trust-region
algorithm shows the power of trust-region globalization ideas. It states that if $f$ is uniformly
continuously differentiable and the $\{H_i\}$ are only assumed to be uniformly bounded, then the sequence
of iterates generated by an FCD trust-region algorithm is well-defined and satisfies

$$\liminf_{i \to \infty} \|\nabla f(x_i)\| = 0.$$
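
The behavior described by this result can be observed on a small nonconvex example. The sketch below is an illustration under assumptions made here: the test objective, the use of pure Cauchy steps (which satisfy FCD trivially), the acceptance threshold, and the radius update rules are not taken from the text.

```python
import math


def f(x):
    # Assumed nonconvex test objective with minimizers at (1, 0) and (-1, 0).
    return (x[0] ** 2 - 1.0) ** 2 + x[1] ** 2


def grad(x):
    return [4.0 * x[0] * (x[0] ** 2 - 1.0), 2.0 * x[1]]


def hess(x):
    return [[12.0 * x[0] ** 2 - 4.0, 0.0], [0.0, 2.0]]


def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))


def cauchy_step(g, H, delta):
    """Return the Cauchy step and its predicted reduction (g must be nonzero)."""
    g_norm = norm(g)
    n = len(g)
    gHg = sum(g[i] * H[i][j] * g[j] for i in range(n) for j in range(n))
    alpha = delta / g_norm
    if gHg > 0.0:
        alpha = min(alpha, g_norm ** 2 / gHg)
    pred = alpha * g_norm ** 2 - 0.5 * alpha ** 2 * gHg
    return [-alpha * gi for gi in g], pred


def trust_region(x, delta=1.0, tol=1e-6, max_iter=1000):
    for _ in range(max_iter):
        g = grad(x)
        if norm(g) <= tol:
            break
        s, pred = cauchy_step(g, hess(x), delta)
        x_trial = [xi + si for xi, si in zip(x, s)]
        ratio = (f(x) - f(x_trial)) / pred
        if ratio < 1e-4:
            delta *= 0.5                        # reject the step, shrink the region
        else:
            x = x_trial                         # accept the step
            if ratio > 0.75:
                delta = min(2.0 * delta, 100.0)  # assumed cap on the radius
    return x


x_final = trust_region([2.0, 1.0])
grad_norm_final = norm(grad(x_final))
```

The iterates reach a point with a small gradient norm even though the objective is not convex, which is the kind of guarantee the theorem provides; the convergence is slow because only steepest-descent information is used.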

Sorensen [17] has shown stronger convergence results for trust-region algorithms with steps that 
satisfy FOD. Specifically, he has shown that if the Hessian is Lipschitz continuous, and if exact 
Hessians are used in the local models, then any limit point of the iterates satisfies second order 
necessary conditions, i.e., has a positive semidefinite Hessian. Furthermore, under some reasonable 
additional assumptions, the iteration sequence converges q-quadratically to a second order necessary 
point for UNC. 

Detailed treatment of the unconstrained minimization theory and practice can be found in Moré
[8], Moré and Sorensen [9], Sorensen [17], and Shultz, Schnabel and Byrd [13].

2.4 Multilevel Methods for Nonlinear Equations and Equality Constrained Optimization

The algorithms introduced here are based on the recently proposed class of multilevel algorithms 
for equality constrained optimization and nonlinear equations (see Alexandrov [1], Alexandrov and 
Dennis [2], [3]). 

The algorithms of that class use trust regions as a globalization strategy, and they have been shown
to be globally convergent under reasonable assumptions. They have the following characteristics:

• The constraints of the problem can be partitioned into blocks by the user in any manner 
suitable to an application, or in any arbitrary manner at all. 

• The analysis of the methods assumes certain standard smoothness and boundedness proper- 
ties, but no other assumptions are made on the structure of the problem. 

• The algorithms solve at each iteration progressively smaller dimensional subproblems to arrive 
at the trial step. 

• The trial steps computed by the algorithm are required to satisfy very mild conditions, both 
theoretically and computationally. In fact, the substeps comprising the trial step can be 
computed in the subproblems using different optimization algorithms. The substeps are 
only required to satisfy a mild decrease condition for the subproblems and a reasonable 
boundedness condition — both satisfied in practice by most methods of interest. 

The proposed multilevel class of algorithms differs from the conventional algorithms in that its 
major iteration involves computing an approximate solution of not one model over a single restricted 
region, but of a sweep of models, each approximately minimized over its own restricted region. Each 
model approximates a block of constraints and, finally, the objective function, restricted to certain 
subspaces. The algorithms proposed in this work follow this principle with equality constraints 
replaced by one or more levels of optimization problems. 
3 Formulations and Algorithms


In this section we consider some formulations of the bilevel problem 

Problem BLO: 

minimize $f_2(x)$

subject to $x \in \operatorname{argmin} \{f_1(x)\}$

and discuss their properties, including necessary conditions for minima. Then we suggest algorithms 
suitable for the specific formulations. 

In our discussions we assume no convexity, unless specified otherwise. We assume that all
functions are at least twice continuously differentiable and that $f_1$ is bounded from below.

The formulation, which we call BLO, means that among the minima of $f_1$ we wish to find a
point for which the value of $f_2$ is the lowest. There are three cases.

1. $f_1$ is strictly convex. There is one global minimizer of $f_1$, and, therefore, the feasible point is
the solution of the problem.

2. The set of minima of $f_1$ is a set of isolated points. Since algorithms for continuous nonlinear
optimization are guaranteed, in general, to find only a local solution, this case, in effect, is
identical to the first one.

3. The set of minima of $f_1$ has a nonempty relative interior.

Since the first two cases are degenerate as bilevel problems, we shall consider only the third one 
from now on. 

Suppose the point $x_* \in \mathbb{R}^n$ solves the innermost problem of problem BLO, i.e., $x_*$ is an
unconstrained minimizer of $f_1(x)$. Let $f_1^*$ be the corresponding value of $f_1$. Then our problem BLO
would seem equivalent to the following problem:

minimize $f_2(x)$
subject to $f_1(x) = f_1^*$.

However, this formulation will not have a Lagrange multiplier at the minimum because $\nabla f_1(x_*) = 0$
and thus the first order necessary conditions will hold only if $\nabla f_2(x_*) = 0$ coincidentally.
Therefore, the problem is ill-posed in this form.
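
The failure of the multiplier can be written out explicitly. The scalar multiplier $\mu$ and the Lagrangian $L$ below are notation introduced here for illustration only.

```latex
\begin{align*}
L(x, \mu) &= f_2(x) + \mu \bigl( f_1(x) - f_1^* \bigr), \\
\nabla_x L(x_*, \mu) &= \nabla f_2(x_*) + \mu \nabla f_1(x_*) = \nabla f_2(x_*)
\qquad \text{since } \nabla f_1(x_*) = 0 .
\end{align*}
```

Stationarity of $L$ therefore reduces to $\nabla f_2(x_*) = 0$ for every value of $\mu$, so no choice of multiplier can compensate for a nonzero gradient of the outer objective.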

3.1 Approach Based on First Order Necessary Conditions for a Solution 

Now consider the following formulation based on the first order necessary condition for the inner- 
most problem: 

Problem FOC: 

minimize $f_2(x)$
subject to $\nabla f_1(x) = 0$.

Clearly, for a convex $f_1$ the formulations BLO and FOC are equivalent. To study the relation
between the formulations in the general case, we introduce the notion of constraint qualifications.

In order to determine optimality in constrained optimization, it is necessary to study the behavior
of the objective function along feasible perturbations. Conditions that allow us to characterize
feasible perturbations completely are known as constraint qualifications. Constraint qualifications 
may take different forms, some of them purely theoretical. A common practical constraint qualifi- 
cation in nonlinear programming is regularity, which is the assumption of full rank for the Jacobian 
of the constraint system. As we mentioned, regularity fails for the most obvious reformulation of 
problem BLO. 

For problem FOC, the Jacobian of the constraint system is $\nabla^2 f_1(x)$. It is a square matrix,
positive semidefinite at a solution of problem BLO. We assume that the matrix is singular, for
otherwise the inner problem would have an isolated minimum, resulting in the degenerate case. We
claim that a reasonable constraint qualification for problems FOC and BLO is to require $\nabla^2 f_1(x)$
to have constant rank in a neighborhood of the solution. This assumption is a natural extension
of the full-rank assumption for rectangular matrices and is based on results on the continuity of
generalized inverses (see Campbell and Meyer [11], for example).

Let $x_*$ solve the bilevel optimization problem BLO. Assuming the constant rank constraint
qualification, it can be shown that the first order necessary conditions for an optimum of problem
BLO and problem FOC are:

$$\nabla f_2(x_*) + \lambda^T \nabla^2 f_1(x_*) = 0;$$

$$\nabla f_1(x_*) = 0;$$

$\nabla^2 f_1(x_*)$ is positive semidefinite.

We believe that adding the condition of positive semidefiniteness of $\nabla^2 f_2(x_*)$ on the null space
of $\nabla^2 f_1(x_*)$ to the above conditions together with our constraint qualification will constitute the
second order necessary optimality conditions for problem BLO.

We also believe that for general nonlinear $f_1$ and $f_2$, if $x_*$ solves problem FOC and it is feasible
for problem BLO, then it also solves problem BLO.
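
These conditions can be checked by hand on a small example. The sketch below assumes the illustrative objectives $f_1(x, y) = x^2$ and $f_2(x, y) = (x - 1)^2 + (y - 2)^2$, whose bilevel solution is $(0, 2)$; the objectives, the multiplier value, and the helper names are hypothetical and serve only to show what each condition asks for.

```python
def grad_f1(p):
    x, y = p
    return [2.0 * x, 0.0]


def hess_f1(p):
    # Constant, singular, rank one everywhere: the constant-rank assumption holds.
    return [[2.0, 0.0], [0.0, 0.0]]


def grad_f2(p):
    x, y = p
    return [2.0 * (x - 1.0), 2.0 * (y - 2.0)]


def stationarity_residual(p, lam):
    """grad f2(p) + hess f1(p) * lam; the Hessian is symmetric, so this
    is the transpose of grad f2(p)^T + lam^T hess f1(p)."""
    H = hess_f1(p)
    g = grad_f2(p)
    return [g[i] + sum(H[i][j] * lam[j] for j in range(2)) for i in range(2)]


def is_psd_2x2(H, tol=1e-12):
    """Positive semidefiniteness test for a symmetric 2x2 matrix."""
    (a, b), (c, d) = H
    return a >= -tol and d >= -tol and a * d - b * c >= -tol


x_star = (0.0, 2.0)
lam = [1.0, 0.0]  # a multiplier that works for this example

residual_at_solution = stationarity_residual(x_star, lam)
# At the inner minimizer (0, 0) the second component cannot be cancelled,
# because the second row of the Hessian of f1 is zero.
residual_elsewhere = stationarity_residual((0.0, 0.0), lam)
```

The point $(0, 0)$ is feasible for the inner problem but fails the first condition for any multiplier, which matches the fact that $f_2$ can still be reduced along the line of inner minima.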

Thus, it is reasonable to attack problem BLO by solving problem FOC if we ensure that the 
solution is a minimum of $f_1$ and therefore feasible for BLO. In practice, we propose to solve problem
FOC by the multilevel algorithm for equality constrained optimization introduced in Alexandrov
[1] (see also Alexandrov and Dennis [2], [3]). To measure progress toward a solution and to ensure
that it is feasible with respect to problem BLO, we propose to try two merit functions:

$$P_1(x; \rho) = f_2(x) + \rho f_1(x)$$

and

$$P_2(x; \rho) = f_2(x) + \lambda^T \nabla f_1(x) + \rho f_1(x)^2.$$

The first merit function is an analog of the objective function used as a merit function in uncon- 
strained minimization. The second one is an analog of the augmented Lagrangian used as a merit 
function in constrained optimization. 
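
A minimal sketch of the two merit functions follows, evaluated on assumed illustrative objectives $f_1(x, y) = x^2$ and $f_2(x, y) = (x - 1)^2 + (y - 2)^2$. The last term of the second function is coded as the penalty parameter times the square of $f_1$, which is how the formula is read here; the function names, the multiplier, and the penalty value are hypothetical.

```python
def f1(p):
    x, y = p
    return x ** 2


def f2(p):
    x, y = p
    return (x - 1.0) ** 2 + (y - 2.0) ** 2


def grad_f1(p):
    x, y = p
    return [2.0 * x, 0.0]


def merit_1(p, rho):
    """First merit function: outer objective plus a weighted inner objective."""
    return f2(p) + rho * f1(p)


def merit_2(p, lam, rho):
    """Second merit function: adds a multiplier term on the gradient of f1
    and a squared penalty on f1 (as the formula is read here)."""
    multiplier_term = sum(l * g for l, g in zip(lam, grad_f1(p)))
    return f2(p) + multiplier_term + rho * f1(p) ** 2


bilevel_solution = (0.0, 2.0)   # minimizes f2 over the minima of f1
outer_minimizer = (1.0, 2.0)    # minimizes f2 alone, infeasible for the inner problem

values_1 = (merit_1(bilevel_solution, 10.0), merit_1(outer_minimizer, 10.0))
values_2 = (merit_2(bilevel_solution, [1.0, 0.0], 10.0),
            merit_2(outer_minimizer, [1.0, 0.0], 10.0))
```

With a sufficiently large penalty parameter both functions rank the bilevel solution below the unconstrained minimizer of the outer objective, which is the behavior one would want from a merit function in this setting.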

A possible drawback of this approach is that second order information may be necessary
for the algorithm. On the positive side, the analysis of the multilevel algorithms for constrained
optimization [1] will apply to the approach after minor modifications. Both in theory and in 
practice, the steps would have to satisfy the mild FCD condition for the subproblem that they solve. 
In addition, this formulation is easy to extend to the general multilevel optimization problem. 
3.2 Approach Based on Successive Decrease Conditions

Now that we have the first order necessary conditions for a solution of problem BLO, let us consider 
an approach that will require no explicit reformulation of the problem. 

Sorensen [17] has shown that if we use the exact Hessians and steps that satisfy the FOD 
condition in a trust-region algorithm for unconstrained minimization, then the algorithm converges 
to a point that satisfies the first order necessary conditions for a minimum. Thus, it is reasonable 
to expect — though it must be verified — that if we apply an FOD method to our problem, we should 
have convergence to a point satisfying first order necessary conditions. 

The algorithm we propose for bilevel optimization can be stated as follows: 

Compute the trial step for problem BLO to produce an FOD on the quadratic model of $f_2$ subject
to producing FOD on the quadratic model of $f_1$.

A version that imposes a milder FCD type condition on the step is also of interest. 

In practice, the algorithm would be implemented in the following way. The inner problem would 
be solved by a conventional trust-region approach to unconstrained minimization to produce the 
FOD “goal” for the quadratic model of $f_1$ about the current point. Then the outer problem would
be solved in the null space of $\nabla^2 f_1(x_c)$ subject to the condition that the step produce the FOD
condition in the model of $f_2$.
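
One possible shape of such a step is sketched below, in the milder FCD flavor: both substeps are Cauchy steps. Several simplifications are assumptions made here and not part of the proposal: an orthonormal basis `Z` of the null space is supplied by the caller, the model of the outer objective is re-centered at the inner substep, and the trust region is applied to each substep separately rather than to their sum.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def cauchy_step(g, H, delta):
    """Steepest-descent minimizer of the quadratic model within radius delta."""
    g_norm = math.sqrt(dot(g, g))
    if g_norm == 0.0:
        return [0.0] * len(g)
    gHg = dot(g, matvec(H, g))
    alpha = delta / g_norm
    if gHg > 0.0:
        alpha = min(alpha, g_norm ** 2 / gHg)
    return [-alpha * gi for gi in g]


def bilevel_trial_step(g1, H1, g2, H2, Z, delta):
    """Inner substep on the model of f1, then outer substep on the model of f2
    restricted to the span of the rows of Z (assumed orthonormal null-space basis)."""
    n = len(g1)
    s_inner = cauchy_step(g1, H1, delta)
    # Gradient of the quadratic model of f2 evaluated at the inner substep.
    g2_shifted = [a + b for a, b in zip(g2, matvec(H2, s_inner))]
    # Reduced gradient and Hessian of that model on the null space.
    g_red = [dot(z, g2_shifted) for z in Z]
    H_red = [[dot(zi, matvec(H2, zj)) for zj in Z] for zi in Z]
    t = cauchy_step(g_red, H_red, delta)
    s_outer = [sum(t[k] * Z[k][i] for k in range(len(Z))) for i in range(n)]
    return [a + b for a, b in zip(s_inner, s_outer)]


# Usage with f1(x, y) = x^2 and f2(x, y) = (x - 1)^2 + (y - 2)^2 at x_c = (3, -1).
x_c = [3.0, -1.0]
g1, H1 = [6.0, 0.0], [[2.0, 0.0], [0.0, 0.0]]
g2, H2 = [4.0, -6.0], [[2.0, 0.0], [0.0, 2.0]]
Z = [[0.0, 1.0]]  # null space of the Hessian of f1
step = bilevel_trial_step(g1, H1, g2, H2, Z, delta=5.0)
x_next = [a + b for a, b in zip(x_c, step)]
```

Because both objectives in this example are quadratic and the null space is one-dimensional, a single trial step lands on the bilevel solution $(0, 2)$; on general problems the step would be one iteration of a loop with a merit-function test.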

This approach can be extended to any number of levels in a natural way. Clearly, if the objective 
values of all the objectives except the most important one were known at a solution to an MLO 
problem, then one would have a goal program. One can think of our algorithm as a way to set 
goals adaptively for each iteration. 

We propose to use the same two merit functions as in the previous subsection. 


4 Concluding Remarks 

We proposed two approaches to solving the bilevel optimization problem, which can be easily
extended to the general multilevel problem with an arbitrary number of levels and with constraints.

The main difficulties of the multilevel formulations have always been the possible intractability of
the feasible set for the problem and the difficulty of showing the existence of search directions under
reasonable assumptions. We proposed a constraint qualification which is a reasonable extension of the standard
constraint qualification for constrained nonlinear optimization. This qualification has allowed us 
to establish first order necessary conditions for a solution of the bilevel problem. These conditions 
give us hope that the algorithms will be of practical use. Our next step is thorough practical testing 
of the algorithms combined with further theoretical investigations. 